ABSTRACT

We calculate the angular power spectrum of galaxies selected from the Sloan Digital 
Sky Survey (SDSS) Data Release 7 (DR7) by using a quadratic estimation method with 
KL-compression. The primary data sample includes over 18 million galaxies covering 
more than 5,700 square degrees after masking areas with bright objects, reddening 
greater than 0.2 magnitudes, and seeing of more than 1.5 arcseconds. We test for 
systematic effects by calculating the angular power spectrum by SDSS stripe and find 
that these measurements are minimally affected by seeing and reddening. We calculate 
the angular power spectrum for $\ell < 200$ multipoles by using 40 bandpowers for the
full sample, and $\ell < 1000$ multipoles using 50 bandpowers for individual stripes. We
also calculate the angular power spectrum for this sample separated into 3 magnitude 
bins with mean redshifts of z = 0.171, z = 0.217, and z = 0.261 to examine the 
evolution of the angular power spectrum. We determine the theoretical linear angular 
power spectrum by projecting the 3D power spectrum to two dimensions for a basic 
comparison to our observational results. By minimizing the $\chi^2$ of the fit between these data
and the theoretical linear angular power spectrum, we measure a loosely constrained
fit of $\Omega_m = 0.31^{+0.18}_{-0.11}$ with a linear bias of $b = 0.94 \pm 0.04$.
1 INTRODUCTION

The angular power spectrum, $C_\ell$, is a statistical measure
that quantitatively characterizes the large scale angular
distribution of matter (Peebles 1973). Therefore, calculating
the angular power spectrum of galaxies is useful both as a
method of data compression, reducing the clustering information
of an arbitrary number of galaxy positions down to a
set of $C_\ell$ and their corresponding window functions, and
because the $C_\ell$ values derived from the observations can
be easily compared to theoretical predictions.

Calculations of angular power spectra are well known 
to cosmologists for their usefulness in studying the Cosmic 
Microwave Background (CMB), as the CMB provides a detailed
and precise measurement of the density variations in
the early universe (e.g., Smoot et al. 1992; Netterfield et al.
2002; Spergel et al. 2007). However, to study large scale
structure in other eras, it is necessary to analyze how mass 
clusters by using galaxies as a tracer of the underlying dark 
matter distribution. 

Angular power spectra of galaxies have been calculated
for galaxy surveys of various depths and survey
areas (e.g., Huterer et al. 2001; Blake et al. 2004; Frith et al.
2005), including the SDSS (Tegmark et al. 2002, hereafter
T02; Blake et al. 2007; Thomas et al. 2010). By using angular
power spectra to calculate galaxy clustering, we study
the Fourier modes of the galaxy distribution; this method is
most sensitive to large scale effects. Recent galaxy surveys
such as the APM Galaxy Survey (Maddox et al. 1990), the
Two Micron All Sky Survey (Skrutskie et al. 2006), and the
SDSS (Abazajian et al. 2009) have cataloged large areas of
the sky, thereby providing enormous numbers of galaxies for
which we can measure angular clustering. However, to date
the galaxy angular power spectrum has not been calculated
for the full SDSS main galaxy sample. In this paper, we
address this deficiency.

The angular power spectrum is useful for large scale
clustering, and it is complemented by the two-point
angular correlation function on small scales. The two-point
angular correlation function (e.g., Brunner et al. 2000;
Myers et al. 2007; Ross et al. 2010), which is related to the
angular power spectrum by the Legendre transform (T02), is
more sensitive to smaller scale clustering because the calculation
is done in configuration space, where the distances between
nearby pairs of galaxies can be calculated faster. This
makes the two-point angular correlation function advantageous
to use on scales where non-linear evolution is important.
This regime is also where the angular power spectrum
at large $\ell$ is more difficult to measure and model, partly due
to correlations introduced between the $C_\ell$.
To calculate the angular power spectrum, we want to
find the most probable parameters $C_\ell$ that could produce the
data we observe. To do this, we need the likelihood function
of the angular power spectrum, which is proportional to the
probability of the data given the $C_\ell$. Though in theory we
would like to know the entire likelihood function, calculating
this $\ell_{\rm max}$-dimensional function is difficult (Oh et al. 1999).
Fortunately, since we are only interested in the most probable
$C_\ell$, we only need to know the maximum of this
function.

To determine the $C_\ell$ that maximize the likelihood function,
we use the quadratic estimation method (Tegmark
1997; Bond et al. 1998, hereafter BJK98). This technique
fits a quadratic function to the shape of the likelihood function
for some initial angular power spectrum, finds the $C_\ell$
that maximize this quadratic, and uses these $C_\ell$ for a new
quadratic fit to iteratively converge to the true maximum
of the likelihood function. Once we have found the angular
power spectrum of galaxies, we can use the results to
infer what cosmological parameters are consistent with the
measurement (e.g., Jaffe et al. 1999).
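
For reference, the likelihood being maximized can be sketched as follows. This assumes, as is common for this class of estimator, that the pixelized overdensities $\mathbf{x}$ are Gaussian distributed with zero mean and covariance $\mathbf{C}$ built from the $C_\ell$; the symbols $n$ (number of pixels) and $\delta C_b$ (update to parameter $b$) are introduced here for illustration only.

```latex
% Gaussian log-likelihood of the data vector x given the covariance C(C_\ell)
\ln \mathcal{L} = -\frac{1}{2}\left[\mathbf{x}^{T}\mathbf{C}^{-1}\mathbf{x}
                  + \ln\det\mathbf{C} + n\ln 2\pi\right]

% Quadratic (second-order Taylor) approximation around the current estimate
\ln \mathcal{L}(C + \delta C) \simeq \ln \mathcal{L}(C)
  + \sum_{b}\frac{\partial \ln \mathcal{L}}{\partial C_b}\,\delta C_b
  + \frac{1}{2}\sum_{b,b'}\frac{\partial^{2} \ln \mathcal{L}}
      {\partial C_b\,\partial C_{b'}}\,\delta C_b\,\delta C_{b'}

% Maximizing the quadratic gives the update applied at each iteration
\delta C_b = -\sum_{b'}\left[\frac{\partial^{2} \ln \mathcal{L}}
      {\partial C\,\partial C}\right]^{-1}_{bb'}
      \frac{\partial \ln \mathcal{L}}{\partial C_{b'}}
```

Replacing the curvature matrix by its expectation value (the Fisher matrix) gives the iterative scheme of the kind described above.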

In this paper, we discuss the SDSS DR7 data, our selected
sample and subsamples, and our systematic tests and
masks in Section 2. In Section 3 we discuss our pixelization
scheme, KL-compression, and the quadratic angular power
spectrum estimation method of BJK98 in detail. In Section
4 we apply this estimator to the complete SDSS DR7, selected
subsamples, and individual SDSS stripes, and present
the results. We construct a theoretical linear angular power
spectrum to compare with the observational results, and we
extract cosmological matter density and linear bias from this
computation in Section 5. Finally, we discuss our results in
Section 6 and conclude the paper in Section 7.



2 DATA 

The data for these measurements were taken from the SDSS
Data Release 7, the final data release of SDSS-II. The Sloan
Digital Sky Survey (Abazajian et al. 2009) is a multi-filter
imaging and spectroscopic survey using the 2.5 meter telescope
at Apache Point Observatory that began operation in
2000 and ended with SDSS-II in 2008. The imaging observations
are taken simultaneously in 5 filters (u, g, r, i, and
z) as the telescope drift scans across the sky (Gunn et al.
1998). The SDSS DR7 covers 11,663 square degrees in a
striped fashion.

The SDSS DR7 also provides photometric redshifts and
redshift errors for each galaxy (Abazajian et al. 2009). The
SDSS has measured over 900,000 galaxy spectra and uses
these as a reference set to find the 100 nearest neighbors of
a photometrically observed galaxy in color-color space. The
photometric redshift is estimated by fitting a hyperplane
to these neighbors, and the error is determined by the mean
deviations from the best-fit hyperplane (Csabai et al. 2007).
As we require the galaxy redshift for analysis of our results,
any galaxy without both a photometric redshift and an associated
error is not used in our calculation. In SDSS DR7, the
rms error of the photometric redshift estimation is 0.025,
while for our samples it varies from 0.038 in the brightest
sample to 0.064 in the dimmest.
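
The nearest-neighbor hyperplane fit described above can be sketched as follows. This is a minimal illustration, not the SDSS pipeline: the function name `photoz_local_fit`, the Euclidean distance in color space, and the use of the rms residual as the error estimate are all assumptions made for the example.

```python
import numpy as np

def photoz_local_fit(ref_colors, ref_z, query_colors, k=100):
    """Estimate a photometric redshift from the k nearest reference galaxies.

    ref_colors   : (n_ref, n_colors) colors of the spectroscopic reference set
    ref_z        : (n_ref,) spectroscopic redshifts of the reference set
    query_colors : (n_colors,) colors of the galaxy to estimate
    """
    ref_colors = np.asarray(ref_colors, dtype=float)
    ref_z = np.asarray(ref_z, dtype=float)
    query = np.asarray(query_colors, dtype=float)
    k = min(k, len(ref_z))

    # Nearest neighbors in color space (Euclidean metric is an assumption).
    dist = np.linalg.norm(ref_colors - query, axis=1)
    nearest = np.argsort(dist)[:k]

    # Least-squares hyperplane z = a0 + a . colors through the neighbors.
    design = np.hstack([np.ones((k, 1)), ref_colors[nearest]])
    coeffs, *_ = np.linalg.lstsq(design, ref_z[nearest], rcond=None)

    z_est = float(coeffs[0] + coeffs[1:] @ query)
    residuals = ref_z[nearest] - design @ coeffs
    z_err = float(np.sqrt(np.mean(residuals ** 2)))  # rms scatter about the fit
    return z_est, z_err
```

A galaxy whose neighbors scatter widely about the fitted hyperplane receives a correspondingly larger redshift error in this sketch.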



2.1 Area 

We begin by selecting a large, contiguous area of DR7, from 
stripes 9 to 37, an area of 7,646 square degrees before mask- 
ing. Each stripe is 2.5 degrees wide in eta (the survey lati- 
tude), and variable length in lambda (the survey longitude). 
Typically, however, the stripes are 100-120 degrees long. Using
this large area allows us to use a bandpower resolution
of up to 4 multipoles per bandpower when calculating the
angular power spectrum for the full sample (see Section 3.2).
Since this area is centered around the North Galactic Cap,
we avoid the worst areas of reddening due to the Galactic
disk. After masking for observational effects (e.g., reddening,
seeing, bright stars; see Section 2.2), our sample includes
18.9 million galaxies over 5,763 square degrees of the SDSS
Northern Galactic Cap ellipsoid.

2.2 Systematics 

The data that we use span a wide range of Galactic latitudes,
and we have considered the effect of stellar density on our
galaxy samples. Bright stars in our Galaxy could
obscure background galaxies (Ross et al. 2011), or faint stars
could be misclassified as galaxies by the star-galaxy separation
routine. To examine these possibilities, we have calculated
the galaxy overdensity and stellar overdensity separately,
applied our masks, and plotted these overdensities
versus Galactic latitude in Figure 1. We see two exponential
falloffs in the stellar overdensity, which correspond to the
two edges of the SDSS dipping toward the Galactic disk:
the high Galactic latitude exponential comes from the side
of the SDSS in the general direction of the Galactic center,
and the low Galactic latitude exponential from the side
near the Galactic anticenter. The galaxy overdensity, in contrast, is
consistent with zero at all Galactic latitudes in our sample.
For the large pixel sizes we use in the following calculations,
obscuration by bright stars does not have a large effect on
the galaxy overdensity, and even at the lowest magnitude
we use, star-galaxy separation is accurate at the 95% confidence
level (Lupton et al. 2001), so we observe no effect on
the galaxy overdensity from stars.

To test the homogeneity and observational character of
the data, we calculate the angular power spectrum separately
for each stripe, using the method discussed in Section
3. If there is a significant deviation in the angular power
spectrum from stripe to stripe, observational systematics
might dominate over the real density variations of the combined
stripe data that makes up our full sample. To test for
these systematics, we have calculated angular power spectra
of each SDSS stripe from stripe 9 to stripe 37 after masking,
with each of the $C_\ell$ including an identical range of $\ell$. The
angular power spectra from each stripe are remarkably consistent
with each other, as shown in the box-whisker
plot in Figure 2, which indicates that these observational systematics
do not significantly alter the angular power spectra.
The only notable variation between stripes is that the edge
stripes 9 and 37 have much larger error bars because these
stripes have the most pixels eliminated by the seeing
and reddening cuts.

We have also varied the seeing and reddening cuts to 
test their effects. We have varied seeing cuts from 1.0 to 3.0 
arcseconds in 0.1 arcsecond intervals, and reddening cuts 
Figure 1. Points in black are the pixelized stellar overdensities
as a function of Galactic latitude at HEALPix resolution 64. The
exponential falloff of the Galactic disk is seen here twice: at high
Galactic latitude we see the falloff of the stars toward the Galactic
center, and at low Galactic latitude we see the stars in the direction
of the Galactic anticenter. We group the pixelized galaxy
overdensities by Galactic latitude into 20 bins, which are graphed
as a box plot. For each bin, the median galaxy overdensity is plotted
in red, the ends of the boxes mark the 25% and 75% quartiles,
and the ends of the whiskers mark the minimum and maximum
overdensities in that bin.




Figure 2. Box plot of the angular power spectra of galaxies with 
dereddened r-band magnitudes between 18 and 21 for the individ- 
ual stripes 9 through 37. The median is in red, the 25% and 75% 
quartiles marked as the edge of the boxes, and the minimums and 
maximums marked at the end of the whiskers. 



from 0.1 to 0.5 magnitudes in 0.05 magnitude intervals,
but found that neither seeing nor reddening had a significant
impact so long as a sufficient galaxy density remained
to calculate the angular power spectra. This is consistent
with the cross correlations between galaxy density and
reddening/seeing calculated by T02 for stripe 10. Nevertheless,
to minimize systematics in the SDSS galaxy sample,
we have eliminated areas of seeing greater than 1.5 arcseconds
and reddening worse than 0.2 magnitudes to be
consistent with similar angular correlation function results
(Ross et al. 2007), though others have used more stringent
cuts (Wang & Brunner 2012).



2.3 Subsamples 

We have chosen our main sample to be from 18th to 21st
magnitude in the extinction corrected r-band (Stoughton et al.
2002), with the faint limit chosen due to concerns about
completeness in the sample past 21st magnitude. Though
the 95% completeness r-band magnitude limit is 22.2
(Abazajian et al. 2009), some galaxies at the fainter end of
the 21-22 magnitude range are not detected or are unusable due
to large errors, and we choose to limit our analysis to more
complete samples.

We have chosen subsamples of our main sample for comparison
to previous results, and to test for potential systematic
errors on galaxy selection. We first confirm that our technique
is consistent with the results from T02 up to 21st
magnitude; to do so, we have separated stripe 10 into 3 magnitude
bins from 18-19, 19-20, and 20-21. The comparison
can be expected to be slightly different due to the use of
the more complete DR7 data as opposed to the Early Data
Release results that used galaxy probabilities (T02), in addition
to the difference in the photometric calculation of magnitudes
in SDSS data prior to DR2 (Abazajian et al. 2004). We show
these results in Section 4.

We also measure the clustering attributes based on the
brightness of the galaxies. The apparently brighter galaxies
cluster more strongly and are generally at lower redshift;
thus, we expect those to have more power in the angular
power spectrum. We create three new samples by separating
the SDSS galaxies into 3 different r-band magnitude bins
from magnitudes 18-19, 19-20, and 20-21. These magnitude
ranges are sufficiently bright to minimize the systematic effects
of star-galaxy separation and variable sky brightness.
These samples have intrinsically different redshift distributions
and luminosity functions; therefore, the angular power
spectra of these samples will reflect these differences, and
they also serve as an important systematic test.



2.4 Simulated Data Set 

In addition to matching the published results from T02 and
verifying that our results from all stripes across the SDSS
DR7 are consistent, we performed one additional test of the
veracity of our quadratic angular power spectrum estimator.
We have generated simulated sky maps and compared
the results from our quadratic estimator to the results from
the HEALPix¹ angular power spectrum estimator anafast.
We first generated a linear angular power spectrum as described
in Section 5.1 and used the HEALPix synfast routine
to create ten pixelated sky maps at HEALPix resolution
2048. Second, we converted the pixel values in each of
these ten sky maps to galaxy overdensities by using the average
galaxy density of the SDSS DR7. Third, we masked, in
an identical manner to our treatment of the galaxy samples,
each of these simulated full sky maps to the stripe 10



¹ See http://healpix.jpl.nasa.gov
Figure 3. The results of our quadratic angular power spectrum
estimation analysis of these 10 simulated maps are plotted as a
box plot with the median in red, 25% and 75% quartiles at the 
ends of the boxes, and the minimum and maximum results at the 
ends of the whiskers. The yellow band shows the minimum and 
maximum angular power spectrum measurements determined by 
the ten anafast measurements as described in the text. 



boundary as described in Section 3.1.2. Finally, we combine
pixels to produce a degraded map with HEALPix resolution
256. With these degraded sky maps, we calculate the angular
power spectrum by applying our quadratic estimator to
these ten samples out to $\ell = 510$.
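
The pixel-combining step can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: it assumes the map is stored in HEALPix NESTED ordering, that "resolution" means the HEALPix Nside parameter, and that child pixels are combined by a plain average (masked pixels are not handled). The function name `degrade_nested` is hypothetical.

```python
def degrade_nested(fine_map, nside_in, nside_out):
    """Average a NESTED-ordered HEALPix map from nside_in down to nside_out.

    In NESTED ordering the children of coarse pixel p occupy the contiguous
    index range [p * ratio, (p + 1) * ratio) of the finer map.
    """
    if nside_in % nside_out != 0:
        raise ValueError("nside_in must be a multiple of nside_out")
    if len(fine_map) != 12 * nside_in ** 2:
        raise ValueError("fine_map length does not match nside_in")
    ratio = (nside_in // nside_out) ** 2  # fine pixels per coarse pixel
    npix_out = 12 * nside_out ** 2
    return [
        sum(fine_map[p * ratio:(p + 1) * ratio]) / ratio
        for p in range(npix_out)
    ]

# Going from resolution 2048 to 256 combines (2048 / 256)**2 = 64 pixels each.
coarse = degrade_nested(list(range(48)), nside_in=2, nside_out=1)
```

Averaging is appropriate for an overdensity map because all HEALPix pixels at a given resolution have equal area; a map of raw counts would be summed instead.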

We also use synfast to generate the same ten maps
at HEALPix resolution 256, and calculate the angular power
spectrum by using the HEALPix angular power spectrum estimator
anafast to provide a direct comparison to the results
from our quadratic estimator. At resolution 256, we use the
recommended $\ell = 512$ for synfast and anafast, and perform
a standard analysis with anafast of the entire pixelated
sky with no regression, masking, or cuts. We show
these results along with the results from our quadratic estimator
in Figure 3. Both estimators show remarkable agreement,
despite the fact that anafast is operating on a full
sky map and our quadratic estimator is operating with the
stripe 10 window function. As a result, we conclude that our
implementation of the quadratic estimator and the results we
derive are robust.



3 METHOD 

Angular power spectra attempt to measure the multipole
moments, $\ell$, of a two dimensional distribution, in our case
the galaxy density (Jaffe et al. 1999). However, since photometric
surveys only observe portions of the sky, all multipole
moments cannot be individually determined (Tegmark
1996); what is measured instead is a group of them simultaneously.
Multipole moments are grouped into contiguous
bands, called bandpowers, and we make the assumption that
all moments in the bandpower are equal (e.g., Huterer et al.
2001). The same computation is subsequently performed on
the bandpowers as would normally be performed on the individual
multipole moments. This also serves to reduce the computation
needed for the calculation (Borrill 1999). First, we
calculate the angular power spectrum by using the smallest
bandpowers possible, and these bandpowers are then averaged
together into larger bands to improve the signal-to-noise and
reduce errors (BJK98).

Typically, Fourier methods are used to describe the dis- 
tribution of a continuous population, but the galaxy distri- 
bution is discrete. To calculate an angular power spectrum, 
we transform the discrete galaxy counts into a continuous 
galaxy density distribution. To do this, the sky is divided 
into "pixels" and the galaxy density in each pixel is calcu- 
lated. The calculation continues in the same way as it would 
with a CMB temperature map (e.g., BJK98). Smaller pixels
provide more information about the angular power spectrum,
but the computation required is highly dependent on
the number of pixels (Tegmark 1997).

In this section, we first discuss how we pixelize and
mask the data, followed by our selection of bandpowers in
Section 3.2. In Section 3.3 we detail how we
calculate an angular power spectrum, beginning with KL-compression,
followed by the quadratic estimation technique and the
computational difficulty involved in this calculation. In Section
3.4 we describe how these bandpowers can be combined
to produce higher signal-to-noise angular power spectrum
estimates, and how to calculate the window functions associated
with these measurements.



3.1 Pixelization 

We have chosen to use a quadratic estimation approach to
calculate the maximum likelihood of the angular power spectrum
using KL-compression (Bond 1995; Bunn 1995). To
convert the discrete galaxy observations into a continuous population,
the sky is pixelated to determine the galaxy overdensity
per pixel.

We pixelate the sky using equal area pixels and remove
areas that are outside the survey geometry, or have high
seeing or reddening values. Any pixels with less than 75%
usable area are not considered in the calculation. In the end,
the galaxy overdensity is calculated:

$$x_i = \frac{G_i}{\bar{G}\,\Omega_i} - 1 \tag{1}$$

where $G_i$ is the galaxy count in pixel $i$, $\bar{G}$ is the average
number of galaxies per square degree over the survey area,
and $\Omega_i$ is the area of the pixel in square degrees. Thus the
data set of possibly millions or more galaxies is reduced to a
set of pixels that encodes the galaxy overdensities. The actual
choice of pixelization technique, however, is important,
and we have tested two different pixelization schemes, each
with its own advantages.
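
The overdensity calculation can be sketched as follows, taking Equation 1 to have the form $x_i = G_i/(\bar{G}\,\Omega_i) - 1$. This is a minimal illustration rather than the code used for the analysis; the function name `pixel_overdensity` and the `usable_fraction` input are hypothetical, and computing the mean density only from pixels that pass the 75% usable-area cut is an assumption.

```python
def pixel_overdensity(counts, areas_deg2, usable_fraction, min_usable=0.75):
    """Return {pixel index: overdensity x_i} for pixels passing the area cut.

    counts          : galaxy count G_i in each pixel
    areas_deg2      : pixel area Omega_i in square degrees
    usable_fraction : fraction of each pixel left after masking (hypothetical input)
    """
    # Pixels with less than min_usable usable area are dropped entirely.
    kept = [i for i, frac in enumerate(usable_fraction) if frac >= min_usable]

    # Mean surface density G-bar (galaxies per square degree) over kept pixels.
    mean_density = sum(counts[i] for i in kept) / sum(areas_deg2[i] for i in kept)

    return {i: counts[i] / (mean_density * areas_deg2[i]) - 1.0 for i in kept}

x = pixel_overdensity(
    counts=[30, 10, 20, 5],
    areas_deg2=[1.0, 1.0, 1.0, 1.0],
    usable_fraction=[1.0, 1.0, 1.0, 0.5],
)
```

By construction the area-weighted mean of the retained overdensities is zero, which is what the covariance model of Section 3.3 assumes.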



3.1.1 Pixelization Schemes 

SDSSPix is a hierarchical, equal area pixelization scheme 
developed specifically for the SDSS by Max Tegmark, 
Yongzhong Xu, and Ryan Scranton². It uses the natural
SDSS stripe geometry to divide the sky into pixels aligned 
with the SDSS survey coordinates, eta/lambda. Pixels at a 



² See http://dls.physics.ucdavis.edu/~scranton/SDSSPix/
particular resolution have a constant width in eta, and a
variable width in lambda to satisfy the equal area requirement.
While SDSSPix is useful because of the alignment of
pixels with survey boundaries, which makes seeing and reddening
in pixels easier to quantify, the elongation of pixels
away from the survey center interfered with the convergence
properties of our algorithm described below. This is because
elongated pixels smooth density variations preferentially in
the direction of elongation while retaining that information
in the perpendicular direction. This increases the covariance
between the smaller scale modes and drives increasing oscillation
in high $\ell$ bandpowers with each iteration.

HEALPix is also a hierarchical, equal area pixelization
scheme (Górski et al. 2005), created for CMB experiments
such as WMAP and Planck. It divides the sphere into 12 pixels
at the base resolution, and higher resolutions recursively
quarter these large pixels. The benefit of using HEALPix is
that while pixel boundaries have no relation to our observational
data, the pixels are not elongated as they are with
SDSSPix. Due to the stability of the quadratic estimation
method using HEALPix, we have opted to pixelize our data
with HEALPix for our calculation.
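
The pixel counts and areas implied by this recursive quartering can be sketched as follows. The example assumes that the "resolution" quoted in this paper corresponds to the HEALPix Nside parameter, which the text does not state explicitly; the helper names are hypothetical.

```python
import math

def healpix_npix(nside):
    """Number of pixels on the full sphere: 12 base pixels, each split nside**2 ways."""
    return 12 * nside * nside

def healpix_pixel_area_deg2(nside):
    """Area of one (equal area) pixel in square degrees."""
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41,253 deg^2
    return full_sky_deg2 / healpix_npix(nside)

for nside in (64, 256, 2048):
    print(nside, healpix_npix(nside), healpix_pixel_area_deg2(nside))
```

Each doubling of Nside quarters the pixel area, so the pixel count over a fixed survey footprint, and with it the size of the matrices in Section 3.3, grows by a factor of four.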



3.1.2 Pixel Masks 

Masking with HEALPix is more complicated than with
SDSSPix since pixels may overlap the survey boundaries.
For unbiased results, any pixel that overlaps a boundary
must not be considered in the calculation since it may have
an unphysical overdensity. Thus many pixels on stripe edges
are masked. We also eliminate pixels that are not contiguous
with the primary SDSS observing footprint. A random
sample of 100,000 of the pixels not used due to the boundary
is shown in the top panel of Figure 4; we have plotted
only a sample to prevent obscuration of the coordinate lines.
Furthermore, we must mask pixels in areas with poor
image quality; these pixels are shown in the bottom panel of
Figure 4. Additionally, we remove pixels where the mean
seeing is more than 1.5 arcseconds, and pixels where the
mean reddening is greater than 0.2 magnitudes, shown in the top and
bottom panels of Figure 5, respectively.



3.2 Selecting Bandpowers 

The first step in our approach is to select the initial
fine bandpowers. Multipole resolution is limited by $\Delta\ell \approx
180^\circ/\phi$, where $\phi$ is the analyzed area's smallest angular
dimension (Peebles 1980). For this reason, we want the broadest
survey possible. Aside from being restricted to choosing
bandpowers wider than this limit, the choice of the starting
multipole, ending multipole, initial value, and width of each bandpower is
unrestrained, although some choices of initial values may
cause non-convergence or singular matrices. We chose initial
bandpowers of equal widths, each 5 multipoles wide for the full sample
and 20 multipoles wide for the individual stripes. We use initial values
based on a prior angular power spectrum; however, since the
quadratic estimation method uses iteration, the final result
is fairly insensitive to the input angular power spectrum. We
assume all $\mathcal{C}_\ell$ within a band to be constant (Huterer et al.
2001):



Figure 4. Top: the HEALPix pixels removed for being outside
the chosen SDSS boundary; for clarity we have plotted a random
sample of the masked pixels. Bottom: the HEALPix pixels
removed due to poor image quality.



$$\mathcal{C}_\ell \equiv \frac{\ell(\ell+1)\,C_\ell}{2\pi} = \sum_b \chi_b(\ell)\,\mathcal{C}_b \tag{2}$$

where $\chi_b(\ell) = 1$ while $\ell \in b$ and zero otherwise, and we define
$\mathcal{C}_\ell$ according to standard convention (Bond et al. 2000).
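
The flat-bandpower model can be sketched as follows, taking Equation 2 to be $\mathcal{C}_\ell \equiv \ell(\ell+1)C_\ell/2\pi = \sum_b \chi_b(\ell)\,\mathcal{C}_b$. The helper names and the multipole range used in the example are illustrative assumptions, not the binning used for the measurements.

```python
import math

def make_bands(ell_min, ell_max, width):
    """Equal-width bandpowers as inclusive (lo, hi) multipole ranges."""
    return [(lo, min(lo + width - 1, ell_max))
            for lo in range(ell_min, ell_max + 1, width)]

def chi(band, ell):
    """Indicator chi_b(ell): 1 if ell lies in band b, else 0."""
    lo, hi = band
    return 1 if lo <= ell <= hi else 0

def script_c_of_ell(ell, bands, band_values):
    """Piecewise-constant script-C_ell built from the bandpower values."""
    return sum(chi(b, ell) * value for b, value in zip(bands, band_values))

def to_raw_c_ell(ell, script_c):
    """Invert script-C_ell = ell (ell + 1) C_ell / (2 pi)."""
    return 2.0 * math.pi * script_c / (ell * (ell + 1))

# Illustrative binning: 40 bands, each 5 multipoles wide, starting at ell = 2.
bands = make_bands(2, 201, 5)
values = [1.0e-3] * len(bands)
print(len(bands), bands[0], bands[-1], script_c_of_ell(9, bands, values))
```

Holding $\mathcal{C}_\ell$ rather than $C_\ell$ constant within a band means the underlying $C_\ell$ falls as $1/\ell(\ell+1)$ across the band.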

We start with an initial fine binning to determine where
the power is inside the larger bandpowers that we later use.
The Fisher information matrix (defined in Equation 8) is
used to construct the bandpower window functions, and after
we have performed the quadratic estimation to find the
maximum likelihood, we will use these window functions to
determine the correlation between bandpowers and individual
multipole moments $\ell$.
Figure 5. Top, HEALPix pixels removed for high seeing. Bottom,
HEALPix pixels removed due to high reddening.



3.3 Calculating $\mathcal{C}_b$

Using only a knowledge of the survey geometry (or at least
the region under consideration) and the assumed values for
the bandpowers, we construct the covariance matrix C:

$$C_{ij} \equiv \langle x_i x_j \rangle = S_{ij} + N_{ij} \tag{3}$$

where S is the signal matrix and N is the noise matrix.
The assumed bandpower values $\mathcal{C}_b$ will only be approximate,
which will make the covariance matrix approximate;
but this covariance matrix will be compared to the data
and iteratively corrected to converge to the true bandpower
values. The signal matrix is calculated directly from the pixelated
survey geometry using the assumed set of multipole
values $\mathcal{C}_\ell$. Using Legendre polynomials $P_\ell$ as the variance
window functions, the calculated signal matrix S as shown
by Tegmark (1997) is:

$$S_{ij} = \sum_\ell \frac{2\ell+1}{2\ell(\ell+1)}\,\mathcal{C}_\ell\,P_\ell(\cos\theta_{ij})\,e^{-\ell(\ell+1)\tau^2} = \sum_b \mathcal{C}_b\,(\mathbf{P}_b)_{ij} \tag{4}$$

where $\theta_{ij}$ is the angle between pixels $i$ and $j$. The exponential
factor is introduced to compensate for the smearing
caused by a beam of width $\tau$. For pixels much larger than
the beam, as is the case for a galaxy survey, this factor is
negligible. The noise matrix, N, is modeled as a Gaussian
random process and is diagonal (Huterer et al. 2001):

$$N_{ij} = \sigma_i^2\,\delta_{ij} \tag{5}$$

where $\sigma_i$ is the rms noise in pixel $i$.
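
Building the signal and noise matrices can be sketched as follows. This is a small dense illustration under stated assumptions, not the production code: the beam factor is dropped (the text notes it is negligible here), the per-band matrices are taken to be $(\mathbf{P}_b)_{ij} = \sum_{\ell \in b} \frac{2\ell+1}{2\ell(\ell+1)} P_\ell(\cos\theta_{ij})$, the noise is assumed to be Poisson shot noise with variance equal to the inverse of the expected galaxy count per pixel, and all function names are hypothetical.

```python
import numpy as np

def legendre_all(lmax, mu):
    """P_0 .. P_lmax evaluated at mu, using the Bonnet recursion."""
    mu = np.asarray(mu, dtype=float)
    P = np.empty((lmax + 1,) + mu.shape)
    P[0] = 1.0
    if lmax >= 1:
        P[1] = mu
    for l in range(1, lmax):
        P[l + 1] = ((2 * l + 1) * mu * P[l] - l * P[l - 1]) / (l + 1)
    return P

def band_matrices(unit_vectors, bands):
    """One matrix P_b per band, for bands given as inclusive (lo, hi) with lo >= 1."""
    n = np.asarray(unit_vectors, dtype=float)
    mu = np.clip(n @ n.T, -1.0, 1.0)          # cos(theta_ij) between pixel centers
    P = legendre_all(max(hi for _, hi in bands), mu)
    mats = []
    for lo, hi in bands:
        ells = np.arange(lo, hi + 1)
        weights = (2 * ells + 1) / (2.0 * ells * (ells + 1))
        mats.append(np.tensordot(weights, P[lo:hi + 1], axes=1))
    return mats

def signal_matrix(band_mats, band_powers):
    """S = sum_b script-C_b P_b."""
    return sum(c * Pb for c, Pb in zip(band_powers, band_mats))

def poisson_noise_matrix(expected_counts):
    """Diagonal shot noise, sigma_i^2 = 1 / (expected galaxies in pixel i); an assumption."""
    return np.diag(1.0 / np.asarray(expected_counts, dtype=float))
```

Because the $\mathbf{P}_b$ depend only on the pixel geometry, they can be computed once and reused at every iteration; only the weights $\mathcal{C}_b$ change.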



3.3.1 Karhunen-Loeve Compression 

Rather than perform the full calculation on the vector of
overdensities x, we instead choose to transform into a signal
to noise basis. This is done by using KL-compression
(Vogeley & Szalay 1996; Tegmark et al. 1997). While this is
often useful for data compression, with the high signal in our
sample very few modes are discarded due to having greater
noise than signal.

We begin by solving the generalized eigenvalue equation:

$$\mathbf{S}\,\mathbf{b}_i = \lambda_i\,\mathbf{N}\,\mathbf{b}_i \tag{6}$$

and normalizing such that $\mathbf{b}_i^T\mathbf{N}\,\mathbf{b}_i = 1$. We reorder the vectors
$\mathbf{b}_i$ by the signal to noise ratio, $\lambda_i$, in descending order.
We discard modes with insufficient signal to noise, and we
choose to keep those with $\lambda_i > 1$. The remaining vectors
form the columns of the matrix $\mathbf{B}'$ that we use to transform
the data vector $\mathbf{x}' = \mathbf{B}'^T\mathbf{x}$, as well as the signal, Legendre
polynomial, and noise matrices $\mathbf{S}' = \mathbf{B}'^T\mathbf{S}\mathbf{B}'$, $\mathbf{P}' = \mathbf{B}'^T\mathbf{P}\mathbf{B}'$,
and $\mathbf{N}' = \mathbf{B}'^T\mathbf{N}\mathbf{B}'$ (T02).
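
The KL-compression step can be sketched as follows for the diagonal noise matrix of Equation 5. The sketch solves the generalized eigenproblem $\mathbf{S}\mathbf{b}_i = \lambda_i\mathbf{N}\mathbf{b}_i$ by whitening with $\mathbf{N}^{-1/2}$, which is one standard route and not necessarily the one used for the analysis; the function name `kl_compress` is hypothetical.

```python
import numpy as np

def kl_compress(S, noise_var, x, min_snr=1.0):
    """Return (eigenvalues, B', x') keeping modes with signal-to-noise > min_snr.

    S         : (n, n) signal matrix
    noise_var : (n,) diagonal of the noise matrix N
    x         : (n,) data vector of pixel overdensities
    """
    S = np.asarray(S, dtype=float)
    inv_sqrt_n = 1.0 / np.sqrt(np.asarray(noise_var, dtype=float))

    # Whitened signal matrix N^{-1/2} S N^{-1/2}; its eigenvalues are the lambda_i.
    S_white = S * inv_sqrt_n[:, None] * inv_sqrt_n[None, :]
    lam, V = np.linalg.eigh(S_white)

    # Sort by signal-to-noise in descending order and discard noisy modes.
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    keep = lam > min_snr

    # b_i = N^{-1/2} v_i satisfies S b_i = lambda_i N b_i with b_i^T N b_i = 1.
    B = inv_sqrt_n[:, None] * V[:, keep]
    return lam[keep], B, B.T @ np.asarray(x, dtype=float)
```

In the compressed basis the noise matrix becomes the identity and the signal matrix becomes diagonal with entries $\lambda_i$, which is why $\lambda_i$ can be read directly as a signal to noise ratio.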



3.3.2 Quadratic Estimation 

From the new data vector $\mathbf{x}'$, we perform the outer product
to calculate the observed covariance matrix, $\mathbf{x}'\mathbf{x}'^T$, which
will be compared to the constructed covariance matrix $\mathbf{C}' =
\mathbf{S}' + \mathbf{N}'$.

Now that we have a set of bandpowers that we want to
determine, we calculate the $\mathcal{C}_b$ that have the highest probability
of creating the observed data. A complete calculation
of the likelihood function, although slow, is possible, but a
local maximum can be found by using iteration with the
following estimator (BJK98):

$$\delta\mathcal{C}_b = \frac{1}{2}\sum_{b'} (F^{-1/2})_{bb'}\,\mathrm{Tr}\!\left[(\mathbf{x}'\mathbf{x}'^T - \mathbf{N}')(\mathbf{C}'^{-1}\mathbf{P}'_{b'}\mathbf{C}'^{-1})\right] \tag{7}$$

where the Fisher information matrix F is defined as:

$$F_{bb'} = \frac{1}{2}\mathrm{Tr}\!\left[\mathbf{C}'^{-1}\mathbf{P}'_b\,\mathbf{C}'^{-1}\mathbf{P}'_{b'}\right] \tag{8}$$

Equation 7 provides the mechanism by which we can
compare the covariance matrix obtained from the data, $\mathbf{x}'\mathbf{x}'^T$,
with the constructed covariance matrix $\mathbf{C}'$. What this equation
accomplishes is retrieving the $\mathcal{C}_b$ that produce a covariance
matrix $\mathbf{C}'$ that is identical to $\mathbf{x}'\mathbf{x}'^T$. Note that we use
$F^{-1/2}$ in Equation 7 as advocated by Tegmark (1998) for
uncorrelated error bars and well behaved window functions.
By making an initial estimate of $\mathcal{C}_\ell$, and iteratively applying
this equation, the estimator quickly converges on a
maximally probable set of bandpower values. The error in
bandpower $b$, given by $\sigma_b = \sqrt{(F^{-1})_{bb}}$, is the smallest error
any estimator can measure while estimating parameters
from the sample itself due to the Cramér-Rao inequality
(Kenney et al. 1951; Tegmark 1997).
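
One iteration of a quadratic estimator can be sketched as follows. Note the swap: for simplicity this sketch implements the Newton-Raphson form of BJK98, $\delta\mathcal{C}_b = \frac{1}{2}\sum_{b'}(F^{-1})_{bb'}\mathrm{Tr}[(\mathbf{x}\mathbf{x}^T - \mathbf{C})\,\mathbf{C}^{-1}\mathbf{P}_{b'}\mathbf{C}^{-1}]$, which uses $F^{-1}$ and the full residual $\mathbf{x}\mathbf{x}^T - \mathbf{C}$, rather than the $F^{-1/2}$ weighting adopted in this paper. It uses dense matrices and explicit inverses, so it is only suitable for toy problems, and the function names are hypothetical.

```python
import numpy as np

def fisher_matrix(C_inv, band_mats):
    """F_bb' = 1/2 Tr[C^-1 P_b C^-1 P_b'] (the form of Equation 8)."""
    A = [C_inv @ Pb for Pb in band_mats]          # C^-1 P_b, reused for every pair
    nb = len(band_mats)
    F = np.empty((nb, nb))
    for b in range(nb):
        for bp in range(nb):
            F[b, bp] = 0.5 * np.trace(A[b] @ A[bp])
    return F

def newton_step(x, band_powers, band_mats, N):
    """Return (updated bandpowers, 1-sigma errors, Fisher matrix) after one step."""
    C = sum(c * Pb for c, Pb in zip(band_powers, band_mats)) + N
    C_inv = np.linalg.inv(C)
    F = fisher_matrix(C_inv, band_mats)

    # Compare the observed covariance x x^T with the model covariance C.
    resid = np.outer(x, x) - C
    y = np.array([0.5 * np.trace(resid @ C_inv @ Pb @ C_inv) for Pb in band_mats])

    delta = np.linalg.solve(F, y)                  # F^-1 y without forming F^-1
    errors = np.sqrt(np.diag(np.linalg.inv(F)))    # sigma_b = sqrt((F^-1)_bb)
    return np.asarray(band_powers, dtype=float) + delta, errors, F
```

In use, `newton_step` would be called repeatedly, feeding the returned bandpowers back in, until the update becomes small compared with the errors.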

3.3.3 Computational Requirements 

The quadratic estimation method is computationally com- 
plex, due to both a large amount of calculation required 
for matrix operations as well as large memory requirements 
to store these matrices. We must consider computational
feasibility when making choices about the extent of the data
that we will analyze. At the scales of interest, we have found
that the processing time for a single processor scales as:

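[Equation 9, the single-processor time as a function of $n_b$, $n_i$, and $n_p$, is illegible in the source text.] (9)

and the memory requirements scale as:

$$M \approx 60\,\mathrm{GB}\left(\frac{n_b}{40}\right)\left(\frac{n_p}{6836}\right)^{2} \tag{10}$$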

where $n_b$, $n_i$, and $n_p$ are the number of bandpowers, iterations,
and pixels respectively. Typically only a few iterations
are necessary; we allow 3 iterations to achieve convergence.
These are obviously highly dependent on the number of pixels
$n_p$, and processing time and memory requirements become
prohibitive much beyond $10^4$ pixels (Borrill 1999). As
a result, we have made use of the National Center for Supercomputing
Applications' (NCSA) 1,024 processor SGI Altix
(Cobalt), its successor the 1,536 processor SGI Altix (Ember),
as well as the Pittsburgh Supercomputing Center's 768
core SGI Altix (Pople) and 4,096 core SGI UV 1000 (Blacklight)
for these calculations.

3.4 Interpreting $\mathcal{C}_b$

3.4.1 Averaging $\mathcal{C}_b$

After defining the bandpowers and calculating the $\mathcal{C}_b$, we
use the Fisher information matrix to determine the correlation
between bandpowers (Knox 1998). Narrow bandpower
window functions are preferred so that the error in one band
measurement minimally affects other bands.

Though the Fisher matrix and Cb have already been 
calculated for the choice of bandpowers, we want to have a 
method of combining bandpowers to improve the signal-to- 
noise without recalculating using the computationally de- 
manding quadratic estimator method. For this we use the 
BJK98 method. 

First, smaller bandpowers $b$ are averaged together into
larger bandpowers $B$ (not to be confused with the KL-compression
matrix $\mathbf{B}'$ defined earlier) using Equation 11.
We can combine any number of adjacent bandpowers to improve
signal-to-noise, though combining bandpowers from
sections of the angular power spectrum with significant
structure will result in a loss of resolution in the areas of
interest (BJK98).

$$\mathcal{C}_B = \frac{\sum_{b \in B}\sum_{b' \in B} \mathcal{C}_b\,F_{bb'}}{\sum_{b \in B}\sum_{b' \in B} F_{bb'}} \tag{11}$$

The averaged Fisher matrix,

$$F_{BB'} = \sum_{b \in B}\sum_{b' \in B'} F_{bb'} \tag{12}$$

must be calculated to determine the errors on $\mathcal{C}_B$, which are
$\sigma_B = \sqrt{(F^{-1})_{BB}}$ (Tegmark 1997).
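
The band-averaging step can be sketched as follows, assuming the Fisher-weighted forms $\mathcal{C}_B = \sum_{b \in B}\sum_{b' \in B}\mathcal{C}_b F_{bb'} / \sum_{b \in B}\sum_{b' \in B} F_{bb'}$ and $F_{BB'} = \sum_{b \in B}\sum_{b' \in B'} F_{bb'}$. The function name `average_bands` and the `groups` argument (lists of fine-band indices making up each coarse band) are hypothetical.

```python
import numpy as np

def average_bands(band_powers, F, groups):
    """Combine fine bandpowers into coarse ones using the Fisher matrix.

    Returns (C_B, F_BB, sigma_B) for the coarse bands listed in `groups`.
    """
    c = np.asarray(band_powers, dtype=float)
    F = np.asarray(F, dtype=float)
    n_coarse = len(groups)
    C_B = np.empty(n_coarse)
    F_BB = np.empty((n_coarse, n_coarse))

    for I, gI in enumerate(groups):
        # Filter f_Bb = sum over b' in B of F_bb': the weight of fine band b in B.
        f_Bb = F[np.ix_(gI, gI)].sum(axis=1)
        C_B[I] = (f_Bb @ c[gI]) / f_Bb.sum()
        for J, gJ in enumerate(groups):
            F_BB[I, J] = F[np.ix_(gI, gJ)].sum()

    sigma_B = np.sqrt(np.diag(np.linalg.inv(F_BB)))
    return C_B, F_BB, sigma_B

C_B, F_BB, sigma_B = average_bands([1.0, 2.0, 3.0, 4.0], np.eye(4), [[0, 1], [2, 3]])
```

Fine bands with more Fisher information (smaller errors) receive proportionally more weight in the average, and no rerun of the quadratic estimator is needed.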



3.4.2 Calculating Window Functions 

To represent the angular power spectrum visually, the data 
points are characterized not only by the values and errors, 
but also by the width and position of the bandpowers they 
represent. The bandpower window functions are given by 
(T02): 



$$W = D F^{1/2} \qquad (13)$$



where D is the diagonal matrix that makes the rows of W
sum to unity. The midpoints of the bandpowers, $\ell_{\rm eff}$, can
also be calculated. Algorithmically, $\ell_{\rm eff}$ is where half the
power in the band comes from below and half from above
that multipole (BJK98):



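$$f_{Bb} = \frac{\sum_{b' \in B'} F_{bb'}}{\sum_{b \in B} \sum_{b' \in B'} F_{bb'}} \qquad (14)$$

$$\ell_{\rm eff} = \frac{\sum_{b \in B} \ell\, f_{Bb}}{\sum_{b \in B} f_{Bb}} \qquad (15)$$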



We calculate the filter $f_{Bb}$ while doing the averaging
in Section 3.4.1. This filter function tells us how the power
in larger bands is related to the power in the component
smaller bands, and gives us information about how the
power is distributed within the new larger bands (BJK98).
The edges of the band, $\ell^-$ and $\ell^+$, are defined to be where
$\ell f_{Bb}$ drops to $e^{-1/2}$ of the peak power, and we plot these as
horizontal error bars. The angular power spectrum at $\ell_{\rm eff}$
can be plotted with horizontal error bars ranging from $\ell^-$
to $\ell^+$, with value $C_B$ and vertical error bars $\pm\sqrt{(F^{-1})_{BB}}$.
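
The window function and effective multipole calculations can be sketched as follows. This is an illustrative example only: the helper names `window_functions` and `effective_multipole`, the 3-band toy Fisher matrix, and the band-centre multipoles are hypothetical, and the matrix square root is taken as the symmetric square root from an eigendecomposition, which is one common choice and is assumed here.

```python
import numpy as np


def window_functions(fisher):
    """Rows of W = D F^{1/2}, with D chosen so that each row sums to one."""
    evals, evecs = np.linalg.eigh(fisher)            # F is symmetric
    sqrt_f = (evecs * np.sqrt(evals)) @ evecs.T      # symmetric square root of F
    return sqrt_f / sqrt_f.sum(axis=1, keepdims=True)


def effective_multipole(ell_b, fisher, group):
    """Filter-weighted midpoint of a coarse band (Equations 14 and 15)."""
    block = fisher[np.ix_(group, group)]
    f_Bb = block.sum(axis=1) / block.sum()           # filter f_Bb
    return np.sum(ell_b[group] * f_Bb) / np.sum(f_Bb)


# Toy input: three fine bands centred at ell = 10, 20, 30.
fisher = np.array([
    [4.0, 1.0, 0.0],
    [1.0, 4.0, 1.0],
    [0.0, 1.0, 4.0],
])
ell_b = np.array([10.0, 20.0, 30.0])

W = window_functions(fisher)
ell_eff = effective_multipole(ell_b, fisher, [0, 1, 2])
print(W.sum(axis=1), ell_eff)
```

For this symmetric toy matrix the effective multipole falls at the band centre; an asymmetric Fisher block would pull it towards the better-measured fine bands.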



4 THE SDSS ANGULAR POWER SPECTRUM 

The results of our angular power spectrum calculation for
stripe 10 for $\ell < 1000$ are shown in the top panel of Figure
6, separated by magnitude. Though our results are consistently
higher than those in T02 in all samples, we find that
our results are still in agreement. The offset is due to a known
magnitude calculation error in early SDSS data, which
miscalculated galaxy model magnitudes by roughly 0.2 mag
(Abazajian et al. 2004). When we shift the samples by 0.2
magnitudes to account for this difference, our results match
very well with the previous results, typically within one
standard deviation, as shown in the bottom panel of Figure
6. Additionally, as we are using DR7 instead of the EDR,
galaxy counts rather than galaxy probabilities, and HEALPix
rather than SDSSPix, we do not expect the results to
exactly coincide.

In addition, we not only need to know the final $C_\ell$, but
to completely characterize the errors and the structure of
each bandpower, we need to know the window functions.
The variance and covariance of the $C_\ell$ are derived from the
Fisher matrix, and the bandpower window functions show
what $\ell$ the power in a band comes from, so we prefer
bandpower window functions to be as narrow as possible. For
illustration and comparison to T02, the bandpower window
functions for the 18-19 magnitude bin of stripe 10 are shown
in the top panel of Figure 7. We see that at about $\ell \sim 750$,
the window functions become wider, signifying that our
signal has dropped below shot noise fluctuations, so bands
beyond that are not used. In the other magnitude bins, our
signal does not drop below shot noise fluctuations out to
$\ell = 1000$, as in the bottom panel of Figure 7. The
window functions for other stripes are similar, and we have
made these available online¹.

The results of the angular power spectrum of our entire
sample for $\ell < 200$, as well as for our magnitude separated
subsamples, are summarized in Figure 8 and in Table 1.
The brightest and on average closest galaxies in the 18-19
r-band magnitude bin are the most highly clustered at
all $\ell$ as expected. Below that is the 19-20 magnitude bin,
and the least clustered at all $\ell$ is the 20-21 magnitude bin.
Also plotted are the linear theoretical angular power spectra
discussed in Section 5.1 for $\ell < 90$.







5 THEORY 

5.1 Theoretical Power Spectra 

The statistical characterizations of galaxy clustering
provided by our angular power spectrum measurements are only
the first step. In order to constrain models of structure
formation, we must compare these results to theoretical linear
angular power spectra. To obtain theoretical $C_\ell$, we project
the linear 3D power spectrum $P(k)$, modeled with the fitting
formulae of Eisenstein & Hu (1998), down to two dimensions.
With $P(k)$, we can calculate the $C_\ell$ we expect from a
given theory (e.g., Huterer et al. 2001). From Crocce et al.
(2010) we have the exact calculation for the theoretical
linear angular power spectrum:



$$C_\ell = \frac{\ell(\ell+1)}{\pi^2} \int k^2 P(k)\, \Psi_\ell(k)^2\, dk \qquad (16)$$

where:

$$\Psi_\ell(k) = \int \phi(z)\, D(z)\, j_\ell(k r(z))\, b\, dz \qquad (17)$$



(18) 



= = -r 

G dz 

where $D(z)$ is the growth function (Carroll et al. 1992) and
$j_\ell(kr)$ are spherical Bessel functions, $b$ is the bias, and $r$ and
$n_g$ are the comoving distance and number density respectively.
The calculation simplifies under Limber's approximation
(Limber 1953), which avoids evaluating the integrals over the
Bessel functions:



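$$C_\ell = \frac{\ell(\ell+1)}{2\pi} \int \phi^2(z)\, D^2(z)\, b^2\, P\!\left(\frac{\ell + 1/2}{r(z)}\right) \frac{H(z)}{c\, r^2(z)}\, dz \qquad (19)$$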
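
To make the Limber projection concrete, the following sketch evaluates it numerically. It is a toy under stated assumptions rather than the calculation used for the results: it assumes the standard Limber reduction of Equations 16 and 17 with an overall factor $\ell(\ell+1)/2\pi$, a flat model with $\Omega_m = 0.3$, a Gaussian selection function, a placeholder growth function $D(z) = 1/(1+z)$, and a made-up power spectrum shape `toy_pk`. All function and variable names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def trapezoid(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))


def limber_cl(ells, z, phi, growth, bias, pk, omega_m=0.3):
    """Limber-projected ell(ell+1)C_ell/(2 pi) in a flat model.

    Distances are in Mpc/h, so pk must accept k in h/Mpc.
    """
    e_z = np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)   # H(z)/H0
    h_over_c = 100.0 * e_z / C_KM_S                            # H(z)/c in h/Mpc
    # Comoving distance by cumulative trapezoid; the distance to the first
    # grid point is approximated by c*z/H(z), adequate for a small z[0].
    steps = 0.5 * (1.0 / h_over_c[1:] + 1.0 / h_over_c[:-1]) * np.diff(z)
    r = z[0] / h_over_c[0] + np.concatenate(([0.0], np.cumsum(steps)))
    kernel = phi**2 * growth**2 * bias**2 * h_over_c / r**2
    result = []
    for ell in ells:
        k = (ell + 0.5) / r                                    # Limber wavenumber
        result.append(ell * (ell + 1) / (2.0 * np.pi) * trapezoid(kernel * pk(k), z))
    return np.array(result)


def toy_pk(k):
    """Made-up P(k) with a turnover near k = 0.02 h/Mpc (illustration only)."""
    return 2.0e4 * (k / 0.02) / (1.0 + (k / 0.02) ** 2) ** 1.5


z = np.linspace(0.01, 1.0, 500)
phi = np.exp(-0.5 * ((z - 0.2) / 0.05) ** 2)
phi /= trapezoid(phi, z)              # normalise the selection function
growth = 1.0 / (1.0 + z)              # placeholder growth function

cl = limber_cl([10, 30, 90], z, phi, growth, 0.94, toy_pk)
print(cl)
```

A realistic calculation would replace `toy_pk`, `growth`, and `phi` with the fitted linear power spectrum, the proper growth function, and the measured redshift distribution.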



¹ All results discussed in this paper are available at
http://lcdm.astro.illinois.edu/research/aps.html

[Figure 6 legend: Stripe 10 at 18.2-19.2, 19.2-20.2, and 20.2-21.2;
Tegmark (T02) at 18-19, 19-20, and 20-21]

Figure 6. The top panel shows the angular power spectra of
the 3 magnitude cuts on stripe 10. The bottom panel shows the
magnitude shifted angular power spectrum in comparison with
the results of T02.

The theoretical power spectrum depends only on cosmological
parameters through the 3D power spectrum and
the bias, so we can use this dependence to infer constraints
on these values. The only knowledge it requires about the
sample is the redshift distribution. We calculate the redshift
distribution by assuming the redshift of each galaxy is
distributed as a Gaussian with mean equal to the observed
photometric redshift and standard deviation equal to the
error of the photometric redshift. We sample the distribution
of each galaxy and then weight by volume and luminosity
function constraints as in Ross et al. (2010) with the
luminosity function of Montero-Dorta & Prada (2009).
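
The Gaussian sampling step for the redshift distribution can be sketched as below. This is a simplified illustration: it draws from each galaxy's Gaussian and histograms the draws, but omits the volume and luminosity function weighting, and the function name, toy redshifts, errors, and number of draws are all hypothetical.

```python
import numpy as np


def stacked_redshift_distribution(z_phot, z_err, bin_edges, n_draws=100, seed=0):
    """Normalised dN/dz from Gaussian draws around each photometric redshift."""
    rng = np.random.default_rng(seed)
    z_phot = np.asarray(z_phot, dtype=float)
    z_err = np.asarray(z_err, dtype=float)
    # One row per realisation, one column per galaxy.
    draws = rng.normal(z_phot, z_err, size=(n_draws, z_phot.size))
    # Draws outside the binned range are dropped by the histogram.
    counts, _ = np.histogram(draws, bins=bin_edges)
    widths = np.diff(bin_edges)
    return counts / (counts.sum() * widths)


# Toy catalogue of four galaxies (values are made up).
edges = np.linspace(0.0, 1.0, 101)
dndz = stacked_redshift_distribution(
    [0.15, 0.20, 0.20, 0.25], [0.03, 0.03, 0.03, 0.03], edges, n_draws=1000
)
centers = 0.5 * (edges[1:] + edges[:-1])
mean_z = np.sum(centers * dndz * np.diff(edges))
print(mean_z)
```

The normalised histogram plays the role of the selection function in the projection integrals.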

In Figure 9 we show the photometric redshift distribution
of our main sample of over 18 million galaxies, separated
into photometric redshift bins of width 0.001 with
0.0 < z < 1.0. We see that the distribution peaks at
z ~ 0.2 and falls off rapidly past z ~ 0.3. The redshift
distribution is important because we must use it when we
project the 3D power spectra to compare to our angular
power spectra. Also in Figure 9 we have separated the redshift
distribution into magnitude bins, and see the variations
of photometric redshift distributions by magnitude, with the
brighter bins being on average closer than the fainter bins.
The average redshifts of these samples are z = 0.171 for the
18-19 magnitude bin, z = 0.217 for 19-20, z = 0.261 for
20-21, and z = 0.243 for the entire sample.

Table 1. The SDSS Angular Power Spectrum for our entire sample and each of the 3 magnitude subsamples. $\ell_{\rm eff}$ is the point in the
band where half the power is from $\ell < \ell_{\rm eff}$ and half the power is from $\ell > \ell_{\rm eff}$, not necessarily the center of the band.



5.2 Fitting Theory to Data

To constrain cosmological parameters, we use a fitting
technique to determine the calculated theoretical linear
angular power spectrum that best fits the observed bandpower
measurements (Tegmark 1997). First, the newly calculated
theoretical $C_\ell$, evaluated in each band $B'$, are averaged into
theoretical bandpowers $C_B^{T}$ so that these can be compared
with the measurements (Knox 1999):

$$C_B^{T} = \sum_{B'} W_{BB'}\, C_{B'}^{\rm theory} \qquad (20)$$

with the bandpower window function $W_{BB'}$ from Equation
13. We evaluate the following, where $C_B$ are the measured
bandpowers, $F$ is the Fisher matrix, and $a_p$ are the
cosmological parameters (Bond et al. 2000):

$$\chi^2(a_p) = \sum_{BB'} (\ln C_B - \ln C_B^{T})\, C_B F_{BB'} C_{B'}\, (\ln C_{B'} - \ln C_{B'}^{T}) \qquad (21)$$

We assume a flat cosmology and the WMAP baryon
to matter ratio of $\Omega_b/\Omega_m = 0.168$ (Larson et al. 2011) to
perform this minimization for $\ell < 90$. Over this range, the
equivalent k is less than 0.16 h/Mpc at our median redshift
of ~ 0.2; we therefore expect the linear $P(k)$ to be a
good approximation. We note that, given the limited range
of the data used with this cut, the $\ell < 90$ restriction is
not likely to yield competitive constraints on $\Omega_m$, and to
fit the data past $\ell = 90$ we would need to use a non-linear
power spectrum. Indeed, we find a wide range of allowed $\Omega_m$
values, which we illustrate by displaying the results of our
$\chi^2$ minimization for the 18-21 magnitude sample in Figure
10. We find our best fit $\Omega_m = 0.31^{+0.18}_{-0.11}$ and $b = 0.94 \pm 0.04$
for the 18-21 sample, $\Omega_m = 0.26^{+\ldots}_{-\ldots}$ and $b = 1.09 \pm 0.05$ for
the 18-19 magnitude subsample, $\Omega_m = 0.26^{+\ldots}_{-\ldots}$ and $b =
1.03 \pm 0.04$ for the 19-20 magnitude subsample, and $\Omega_m =
0.33^{+\ldots}_{-\ldots}$ and $b = 0.92 \pm 0.04$ for the 20-21 magnitude subsample.
We display these best-fit models against our measurements
in Figure 8.
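
The fitting procedure can be illustrated with a small grid search. This sketch is not the analysis code: it assumes the log-normal form of Equation 21, fits only a bias amplitude on mock data (the real fit also varies $\Omega_m$), and uses a made-up window matrix, Fisher matrix, and template spectrum. The names `band_average` and `lognormal_chi2` are hypothetical.

```python
import numpy as np


def band_average(window, c_model):
    """Window-average model bandpowers, as in Equation 20."""
    return np.asarray(window, dtype=float) @ np.asarray(c_model, dtype=float)


def lognormal_chi2(c_obs, c_theory, fisher):
    """Chi-square of Equation 21 for observed and theoretical bandpowers."""
    c_obs = np.asarray(c_obs, dtype=float)
    delta = np.log(c_obs) - np.log(np.asarray(c_theory, dtype=float))
    weighted = c_obs * delta                 # C_B (ln C_B - ln C_B^T)
    return float(weighted @ np.asarray(fisher, dtype=float) @ weighted)


# Toy setup: template bandpowers for b = 1, scaled by b^2 for other biases.
template = np.array([3.0, 2.0, 1.0])
window = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])
fisher = np.diag([4.0, 9.0, 16.0])
c_obs = band_average(window, 1.1**2 * template)   # mock data with true b = 1.1

biases = np.linspace(0.8, 1.4, 61)
chi2 = np.array([
    lognormal_chi2(c_obs, band_average(window, b**2 * template), fisher)
    for b in biases
])
best_bias = biases[np.argmin(chi2)]
print(best_bias)
```

In a two-parameter fit the same function would be evaluated over a grid in both $\Omega_m$ and $b$, and confidence regions read off from the resulting surface.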



6 DISCUSSION 



Comparing the observed angular power spectrum to a linear
theoretical spectrum is not expected to provide strong
constraints on $\Omega_m$ since varying $\Omega_m$ primarily changes the
angular power spectrum at higher $\ell$. Though weakly
constrained, these measurements of $\Omega_m$ are consistent with
other recent measurements of $\Omega_m$ from galaxy angular
power spectra such as Huterer et al. (2001); Frith et al.
(2005); Blake et al. (2007); Thomas et al. (2010), as well
as measurements through other methods such as the 7-year
WMAP results from the cosmic microwave background
(Larson et al. 2011). This agreement confirms that the samples
of galaxies and the measurement techniques we use have
no large systematic errors.

If we assume that the primordial fluctuations that
seeded the large scale structure that we see today were Gaussian
(e.g., Guth 1981), the angular power spectrum contains
all clustering information on linear scales. However, there
has been some evidence that this might not be the case
(e.g., Elsner & Wandelt 2010). Furthermore, non-linear effects
from gravitational collapse become more pronounced
at higher $\ell$, which also causes a departure from Gaussianity.
Though the quadratic estimator we employ assumes Gaussian
fluctuations, the maximum likelihood angular power
spectrum values we determine are unaffected by potential
non-Gaussianities in the galaxy density field. We note, however,
that the presence of such non-Gaussianities would generally
cause us to underestimate our error bars (T02).

As we estimate the mean galaxy density from the survey
itself, we constrain the data vector x to have zero mean; this
is known as the integral constraint (see Tegmark et al. 1998
for a detailed discussion). If we fail to account for the integral
constraint we can underestimate the power on large scales
(Huterer et al. 2001), so we correct for this by adding a large
number M to the mean mode in the noise matrix N before
KL-compression. The KL-compression stage will determine
that the signal-to-noise of the mean mode is low and it will
be discarded with other low signal-to-noise modes.

Figure 7. Window Functions - The window functions of each of
the 50 bands, for the 18-19 magnitude bin (top) and 20-21 magnitude
bin (bottom) of stripe 10.

Figure 8. Angular Power Spectra - The spectrum of stripes 9 to
37, magnitudes 18-21 in black, 18-19 in red, 19-20 in green, and
20-21 in blue. The solid lines are the best-fit theoretical linear
power spectrum for $\ell < 90$.

Figure 9. The normalized photometric redshift distribution of
all galaxies in stripes 9 to 37, from magnitude 18-21 in black,
magnitude 18-19 in red, 19-20 in green, and 20-21 in blue.

The major limitation of our adopted approach for the
calculation of the galaxy angular power spectrum is the
computational difficulty. The signal-to-noise of the SDSS
DR7 is sufficient to calculate the angular power spectrum to
smaller scales than we have here, but doubling the resolution
quadruples the number of pixels to $n_p \sim 25,000$. Since
matrix multiplication and inversion scale as $O(n^3)$, doubling
the resolution is a 64-fold increase in computation, which is
beyond our current computational resources, though we are
looking into the possibility of performing this calculation,
perhaps by KL-compressing the data even further.
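
The factor of 64 follows directly from the cubic scaling. As a short worked step, writing $n_p$ for the current number of pixels and taking the operation count to be proportional to $n_p^3$:

$$\frac{(4 n_p)^3}{n_p^3} = 4^3 = 64$$

If one further assumes that memory is dominated by dense $n_p \times n_p$ matrices, which is an illustrative assumption rather than a statement from this analysis, storage would grow by a factor of $4^2 = 16$ for the same change in resolution.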

We have also explored using alternative platforms to
accelerate the computation. We have implemented this method
on Graphics Processing Units (GPUs), which are part of
every modern personal computer. GPUs are specifically
designed to parallelize simple computations across many small
multiprocessors, which makes them ideal for vector and matrix
calculations. Using an Nvidia 8800 GTX and transferring
the matrix operations to the GPU, while the rest of the
code ran on the CPU, proved to be very effective at
accelerating the quadratic estimation section of this calculation,
speeding it up by a factor of 337. For this to be effective,
however, the matrices had to fit into the relatively small
on-board memory of the GPU, which in our test system was
768 MB. In comparison, the memory required by the
calculation performed in this paper was roughly 75 GB. So while
this platform seems very promising in accelerating this
computation, the memory available will not be sufficient in the
near future to allow us to meet or exceed the calculations
that can be performed using current supercomputers.

Figure 10. The black point at $\Omega_m = 0.31$, $b = 0.94$ is the
minimum of the $\chi^2$ test for the entire sample; the area in red covers
the 68% confidence level.

While important, this work has merely been the first
step. By applying this method to volume-limited samples,
we can constrain the redshift evolution of the galaxy angular
power spectrum. In addition, we can use the photometric
galaxy type classification to distinguish differences
in the clustering properties of early- and late-type galaxies
in different redshift shells. Furthermore, by utilizing a full
3D, nonlinear theoretical power spectrum, we can model our
measurements to higher $\ell$ values and make more stringent
measurements of cosmological parameters. We plan on
taking these steps in a future work.



7 CONCLUSIONS 

We have used the quadratic estimation method with
KL-compression to determine the SDSS DR7 angular power
spectrum, first as a means of radical compression of the
angular clustering information, and second to match these
observed angular power spectra with theoretical angular power
spectra to extract the linear bias and cosmological matter
density. We masked for observational effects and applied this
method to over 18 million SDSS DR7 galaxies and three
magnitude subsamples out to $\ell < 200$. We also measured
the angular power spectrum for each individual stripe out
to $\ell < 1000$ for stripes 9-37. We have used the photometric
redshift distribution of these galaxies to project the 3D
power spectrum to two dimensions to obtain theoretical linear
angular power spectra, and used $\chi^2$ minimization to
determine the best fit parameters given the observations.
As the linear angular power spectrum approximation is not
valid for the entire range of our estimated angular power
spectrum, these parameter constraints have a large allowed
range of values.

We found that the linear bias of our samples was $b =
1.09 \pm 0.05$ in the 18-19 magnitude range, $b = 1.03 \pm 0.04$ for
19-20, and $b = 0.92 \pm 0.04$ for 20-21, with an overall bias of
$b = 0.94 \pm 0.04$ for our combined 18-21 magnitude sample.
We have also calculated the cosmological density of matter
as $\Omega_m = 0.31^{+0.18}_{-0.11}$ from our entire sample.